
# Deep Clustering and Representation Learning (DCRL)




The code includes the following modules:
* Datasets (MNIST-full, MNIST-test, USPS, Fashion-MNIST, Reuters-10k, HAR)
* Training for DCRL
* Evaluation metrics 
* Visualisation
* Compared methods: DEC and LDEC

## Requirements

* pytorch == 1.3.1
* scipy == 1.3.1
* numpy == 1.18.5
* scikit-learn == 0.21.3
* matplotlib == 3.1.1
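
To confirm that an environment matches the pins above, a small check along the following lines may help. This is only a sketch: `PINNED` and `check_versions` are hypothetical names that are not part of this repository, and the import names `torch` and `sklearn` are assumed to correspond to the `pytorch` and `scikit-learn` entries in the list.

```python
import importlib

# Pins copied from the Requirements list. Keys are import names
# (assumption: PyTorch imports as `torch`, scikit-learn as `sklearn`).
PINNED = {
    "torch": "1.3.1",
    "scipy": "1.3.1",
    "numpy": "1.18.5",
    "sklearn": "0.21.3",
    "matplotlib": "3.1.1",
}


def check_versions(pinned=PINNED):
    """Return {import_name: (pinned, installed_or_None, matches)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            module = importlib.import_module(name)
            installed = getattr(module, "__version__", None)
        except Exception:
            # Any import failure is treated as "not installed".
            installed = None
        # Some builds carry a local suffix such as "+cpu"; drop it before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:12s} pinned {wanted:8s} installed {installed} {status}")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the versions listed here.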

## Description

* main.py  
  * pretrain() -- Pretrains the model with the self-reconstruction loss
  * train() -- Trains the DCRL model end to end
  * test() -- Tests generalization performance on out-of-sample (test) data
* autotrain.py -- Script for automatic testing on five datasets
* dataset.py  
  * Dataset() -- Loads the selected dataset
* evaluation.py  
  * GetIndicator() -- Auxiliary tool for computing the evaluation metrics
* loss.py  
  * Loss_alpha() -- Calculates the losses: ℒ<sub>LIS</sub>, ℒ<sub>rank</sub>, ℒ<sub>AE</sub>, ℒ<sub>align</sub>
* model.py  
  * AutoEncoder() -- The architecture used in this work
  * LDEC() -- Computes the *Q* and *P* distributions
* utils.py  
  * visualize() -- Auxiliary tool for visualizing intermediate results
  * Clustering() -- Initializes the cluster centers
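
For readers unfamiliar with the *Q* and *P* distributions mentioned under model.py, the sketch below shows how they are commonly defined in DEC-style clustering: *Q* is a Student's t soft assignment of each embedding to each cluster center, and *P* is a sharpened target derived from *Q*. It is written in NumPy rather than PyTorch to stay self-contained, and it assumes the standard DEC formulation; the function names, the `alpha` parameter, and the array shapes are illustrative and may differ from what `LDEC()` actually does.

```python
import numpy as np


def soft_assignment(z, centers, alpha=1.0):
    """Q distribution: Student's t similarity between embeddings and centers.

    z:       (n_samples, dim) embeddings
    centers: (n_clusters, dim) cluster centers
    Returns an (n_samples, n_clusters) array whose rows sum to 1.
    """
    # Squared Euclidean distance between every sample and every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """P distribution: square Q, divide by soft cluster frequency, renormalize."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    z = np.array([[0.0, 0.0], [5.0, 5.0]])
    centers = np.array([[0.0, 0.0], [5.0, 5.0]])
    q = soft_assignment(z, centers)
    p = target_distribution(q)
    print(q.argmax(axis=1))  # hard cluster labels implied by Q
    print(p.sum(axis=1))     # each row of P sums to 1
```

In DEC-style training, the clustering loss is typically the KL divergence between *P* and *Q*, which pulls confident assignments closer to their centers.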

## Running the code

1. Install the required dependency packages.

2. To get the results on five datasets, run

   ```
   python autotrain.py
   ```

3. To view the metrics and visualisations, refer to

   ```
   ../plots/dataset/pics/
   ```

   where *dataset* is one of the six datasets (MNIST-full, MNIST-test, USPS, Fashion-MNIST, Reuters-10k, HAR).
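
After a run, the output files for one dataset can be listed with a short helper like the one below. This is a sketch under assumptions: `list_outputs` is a hypothetical function, not part of the repository, and the directory name used for each dataset (for example `MNIST-full`) is a guess based on the path pattern above.

```python
from pathlib import Path


def list_outputs(dataset, root="../plots"):
    """Return the sorted file names found in <root>/<dataset>/pics/."""
    pics = Path(root) / dataset / "pics"
    if not pics.is_dir():
        # Nothing has been written for this dataset yet (or the name differs).
        return []
    return sorted(p.name for p in pics.iterdir() if p.is_file())


if __name__ == "__main__":
    # Assumed directory name; adjust it to match the folders actually created.
    for name in list_outputs("MNIST-full"):
        print(name)
```

An empty result means either that no outputs exist yet or that the assumed directory name does not match the one the scripts use.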
